Abstract. Prototypes of an annotated scientific text visualizer were designed, 
developed, and deployed. This pedagogic tool is designed to help undergraduates 
draft short research articles that conform to the generic expectations of their 
discourse community. This online tool enables users to discover and explore the 
language features present in short research articles. Users choose which research 
articles in the field of computer science to visualize. The articles are categorized into 
four types. Users can hide or reveal particular language features and their 
associated explanations in text, audio, or video formats. This enables them to create 
their own learning paths with this interactive tool. Students can use the visualizer 
to individualize their own learning interactively at their own pace on materials that 
are relevant to them. 


Keywords: scientific writing, visualization, language features, individualized 
learning, discovery learning. 


1. Introduction 


This paper details the design, development, and deployment of a rapid prototype 
and an alpha release prototype of an annotated scientific text visualizer. This 
interactive pedagogic tool aims to help novice writers with English as an 
additional language by visualizing prototypical language features and providing 
multimedia explanations on demand. The inspiration for this tool stems from 
the noticing hypothesis (Schmidt, 2010), which claims that learners must first 
notice language features before learning them. Noticing is achieved through a 
discovery learning approach (Huang, 2008), enhanced using intelligent 
computer-assisted language learning (Amaral & Meurers, 2011). This is the first interactive 
visualization tool for novice writers of computer science research articles. 


The purpose of this phase in the project was twofold. The first aim was to create 
a simple prototype to act as a visual aid that can be used in conjunction with the 
required specifications to show how the fully-fledged prototype should work. The 
second aim was to identify stakeholder expectations and improve our understanding 
of user needs to ensure that the scientific text visualizer not only meets but exceeds 
user expectations. 


The following section provides the background details to the development of the 
scientific text visualizer. Section 3 describes the design specifications. Section 
4 details the development of the rapid prototype, annotated articles, multimodal 
materials, and the alpha release. Section 5 concludes with the alpha release and 
lists future work. 


2. Background 


Both undergraduate and postgraduate students in the school of computer science 
at the University of Aizu in Northern Japan are required to submit short research 
articles in order to fulfill graduation requirements. This is a particularly onerous 
challenge for Japanese students who may have had little exposure to formal 
written English and even less exposure to scientific writing. Ideally, students would 
dedicate a significant amount of time to reading research articles in their field of 
research and so acquire the tacit knowledge required to write their own research 
paper. However, given the severe time constraints that many students face, this 
is not a viable option. A key problem for teachers of the associated technical 
writing courses is providing suitable examples and advice for all students. For 
example, within the field of computer science, some students may write more 
theoretical papers that rely on mathematical proofs while other students may 
develop and evaluate software, making it difficult for teachers to use examples 
that are relevant to all students. 


Japanese students with little proficiency in English could make extensive use of 
Google Translate, which, since its switch to Google Neural Machine Translation, 
produces text that is more comprehensible than the texts those students 
could write themselves. Combined with Grammarly or a similar generic 
error detector, this can yield somewhat comprehensible texts. This harnessing of 
technology, however, has no pedagogic purpose, and so the focus of this online tool 
is to help writers learn more about the target genre by providing explanations on 
demand in the mode and medium that users prefer. This individualized automated 
support is both technically feasible and eminently scalable. 


Individualized learning can solve the problem of differing needs and differing 
wants of students sharing the same class. Students can select example research 
articles which are most relevant to the type of research they are engaged in. They 
can then select the language features that they want to better understand. As this 
tool is online, writers can access it at will on any web-enabled device. This is 
particularly pertinent as many writers draft the final version of their graduation 
thesis over the New Year holiday period. 


3. Design 


A software requirements specification was created detailing use cases and 
requirements from the perspectives of students, teachers, and researchers. 
Research in computer science may be classified into four categories, namely 
empirical, experimental, practical, and theoretical. Once users select the type 
of research article, and the specific article itself, they can individualize their 
learning by showing or hiding various language features on demand. The features 
incorporated are listed in Table 1. For each feature, explanations are provided in 
different modes (text, audio, and video) and mediums (Japanese and English). 


Table 1. Language features to be visualized on demand using toggle buttons 


#   Category               Details
1   organization           sections and moves
2   functions              description, explanation, exemplification, justification, reference to visuals
3   connections            coherence, cohesion (e.g. anaphoric and cataphoric pronouns)
4   linking                conjunctions, adverbs (transitions), and prepositions
5   tense                  tense and aspect (e.g. present perfect progressive)
6   voice                  passive, active (and ergative)
7   modality               hedging, boosting, and approximation
8   abstraction            packaging processes as nouns
9   information structure  end weight, focus, and flow (see Blake, 2015 for more details)
10  word type              first 1,000 words, second 2,000 words, academic word list


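
The relationship between features, modes, and mediums described in this section can be sketched as a small data model. This is only an illustrative sketch, not the data model used in the prototypes: the class names, the `find` helper, and the `explanations/...` resource paths are hypothetical, and the language codes `ja` and `en` are assumed labels for the two mediums.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Mode(Enum):
    TEXT = "text"
    AUDIO = "audio"
    VIDEO = "video"


class Medium(Enum):
    JAPANESE = "ja"  # assumed language code
    ENGLISH = "en"   # assumed language code


@dataclass(frozen=True)
class Explanation:
    mode: Mode
    medium: Medium
    uri: str  # hypothetical location of the explanation resource


@dataclass
class LanguageFeature:
    number: int
    category: str
    explanations: List[Explanation] = field(default_factory=list)

    def find(self, mode: Mode, medium: Medium) -> Optional[Explanation]:
        """Return the explanation for one mode and medium, if it exists."""
        for explanation in self.explanations:
            if explanation.mode is mode and explanation.medium is medium:
                return explanation
        return None


# The ten categories listed in Table 1, in table order.
CATEGORIES = [
    "organization", "functions", "connections", "linking", "tense",
    "voice", "modality", "abstraction", "information structure", "word type",
]


def build_features() -> List[LanguageFeature]:
    """Create one feature per category with every mode/medium combination."""
    features = []
    for number, category in enumerate(CATEGORIES, start=1):
        slug = category.replace(" ", "-")
        explanations = [
            # Placeholder path; the real storage layout is not specified.
            Explanation(mode, medium, f"explanations/{slug}/{medium.value}.{mode.value}")
            for mode in Mode
            for medium in Medium
        ]
        features.append(LanguageFeature(number, category, explanations))
    return features


FEATURES = build_features()
```

Under these assumptions, a toggle button for feature 7 could look up the Japanese video explanation with `FEATURES[6].find(Mode.VIDEO, Medium.JAPANESE)`.
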
4. Development 


Development can be divided into prototype creation (Prototype I and Prototype II) 
and materials creation (annotated articles and multimodal explanations). These are 
discussed in turn below. 


4.1. Prototype I: Axure RP 


A simple working prototype was made using Axure RP. A dropdown menu enables 
users to select research articles that are displayed in the center of the viewport. A 
row of ten toggle function buttons at the top allows users to hide/reveal language 
features. An explanatory panel appears above the research article when a function is 
selected. The explanatory panel contains an embedded video, a textual description, 
and a dropdown menu of other explanatory modes and mediums for the first 
function. 


4.2. Materials creation 


The initial dataset of 12 texts comprises abridged research articles written by 
undergraduates that were submitted as graduation theses, capstone projects, 
or final projects. Based on user feedback, texts over four pages were abridged. 
Where possible, raw text parsing is used, but when state-of-the-art parsing accuracy 
is insufficient, annotation tags are needed. For each function that requires 
annotation, HTML-like tags are used so that rule-based parsing can be used to 
visualize those particular language features. Users expect online learning resources 
to be interactive, highly visual and multimodal (Hafner, Chik, & Jones, 2015). 
Therefore, where possible, explanations are provided in text, image, audio, and video 
formats. Explanatory slideshows were created. Explanations were recorded in both 
English and Japanese to avoid the ‘L2 halting effect’ (Amaral & Meurers, 2011). 
The slideshows and audio files were merged to create videos. 
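
The rule-based parsing of HTML-like annotation tags mentioned above might work roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual parser: the tag names `hedge` and `passive`, the sample sentence, and the function `parse_annotations` are all hypothetical. The sketch strips the tags and records where each annotated feature sits in the plain text, so that a front end could highlight it on demand.

```python
import re
from typing import List, NamedTuple, Tuple

# Matches simple opening and closing tags such as <hedge> and </hedge>.
# Attributes are not supported in this sketch.
TAG = re.compile(r"<(/?)([a-z][a-z0-9_-]*)>")


class Span(NamedTuple):
    feature: str  # tag name, used here as the feature label
    start: int    # offset into the plain (tag-free) text
    end: int      # exclusive end offset


def parse_annotations(annotated: str) -> Tuple[str, List[Span]]:
    """Return the tag-free text and the spans covered by each tag pair."""
    plain_parts: List[str] = []
    spans: List[Span] = []
    open_tags: List[Tuple[str, int]] = []  # stack of (name, start offset)
    plain_len = 0
    cursor = 0
    for match in TAG.finditer(annotated):
        chunk = annotated[cursor:match.start()]
        plain_parts.append(chunk)
        plain_len += len(chunk)
        cursor = match.end()
        closing, name = match.group(1), match.group(2)
        if not closing:
            open_tags.append((name, plain_len))
        else:
            # Tags must close in the reverse order they were opened.
            if not open_tags or open_tags[-1][0] != name:
                raise ValueError(f"unbalanced closing tag </{name}>")
            _, start = open_tags.pop()
            spans.append(Span(name, start, plain_len))
    plain_parts.append(annotated[cursor:])
    if open_tags:
        raise ValueError(f"unclosed tag <{open_tags[-1][0]}>")
    return "".join(plain_parts), spans


# Hypothetical annotated sentence for illustration only.
SAMPLE = (
    "The results <hedge>may</hedge> indicate that the method "
    "<passive>was improved</passive>."
)

if __name__ == "__main__":
    text, found = parse_annotations(SAMPLE)
    for span in found:
        print(span.feature, repr(text[span.start:span.end]))
```

Keeping the offsets relative to the tag-free text means the same article can be shown unannotated by default, with highlights applied only for the features a user has switched on.
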


4.3. Prototype II 


The fully-fledged code version of the annotated text visualizer, Prototype II, 
allows users to select from four types of computer science articles (practical, theoretical, 
empirical, and experimental) from a preloaded database of annotated articles. Users 
select the language features to be visualized on demand using toggle buttons to hide 
and reveal visualizations. When a toggle button is selected, the relevant features in 
the research article displayed in the viewport are highlighted and an explanatory 
panel appears. The explanatory panel is divided into two parts: embedded video 
area and links to additional video, audio, or text explanations. Explanations are 
currently available in English or Japanese, but other languages may be added. 
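
The hide-and-reveal behaviour described in this section can be illustrated with a short sketch. It is not the Prototype II code: the `Article` and `Visualizer` classes, the use of `<mark>` elements for highlighting, and the sample sentence are assumptions made for illustration, and the sketch assumes that annotated spans do not overlap.

```python
import html
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# The four article types named in the paper.
ARTICLE_TYPES = ("practical", "theoretical", "empirical", "experimental")


@dataclass
class Article:
    title: str
    article_type: str
    text: str
    # (feature, start, end) offsets into text; assumed non-overlapping.
    spans: List[Tuple[str, int, int]]


@dataclass
class Visualizer:
    articles: List[Article]
    active: Set[str] = field(default_factory=set)  # features currently revealed

    def by_type(self, article_type: str) -> List[Article]:
        """List the preloaded articles of one research type."""
        if article_type not in ARTICLE_TYPES:
            raise ValueError(f"unknown article type: {article_type}")
        return [a for a in self.articles if a.article_type == article_type]

    def toggle(self, feature: str) -> bool:
        """Flip one feature; return True if it is now revealed."""
        if feature in self.active:
            self.active.remove(feature)
            return False
        self.active.add(feature)
        return True

    def render(self, article: Article) -> str:
        """Return HTML in which only the active features are highlighted."""
        pieces: List[str] = []
        cursor = 0
        for feature, start, end in sorted(article.spans, key=lambda s: s[1]):
            if feature not in self.active:
                continue
            pieces.append(html.escape(article.text[cursor:start]))
            highlighted = html.escape(article.text[start:end])
            pieces.append(f'<mark class="{feature}">{highlighted}</mark>')
            cursor = end
        pieces.append(html.escape(article.text[cursor:]))
        return "".join(pieces)


# Hypothetical sample article for illustration only.
TEXT = "The results may indicate that the method was improved."
SAMPLE = Article(
    title="Sample",
    article_type="empirical",
    text=TEXT,
    spans=[
        ("modality", TEXT.index("may"), TEXT.index("may") + len("may")),
        ("voice", TEXT.index("was improved"),
         TEXT.index("was improved") + len("was improved")),
    ],
)
```

In this sketch, calling `toggle("modality")` followed by `render(SAMPLE)` wraps only the hedging expression in a `<mark>` element; toggling the same feature again restores the unhighlighted text.
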


5. Discussion and conclusion 


This pedagogic tool gives users the power to explore the form and function of language 
features with text, audio, and video explanations. Through exploring the visualizations and 
interacting with multimedia explanations, user awareness of generic expectations 
can be raised. This prototype tool is scalable and can be extended to deal with other 
scientific domains and different genres of writing. 


The next phase of this three-year project is to extend the depth and breadth of 
the language features that can be visualized. The next version of the scientific 
text visualizer will be developed by a team of students using the Python Django 
web framework and Vue.js, as the students have taken elective courses on these 
technologies. In contrast to the early prototypes, the next version will adopt a 
mobile-first approach. 